Combining Pattern-Based and Distributional Similarity for Graph-Based Noun Categorization
Abstract
We examine the combination of pattern-based and distributional similarity for the induction of semantic categories. Pattern-based methods are precise but sparse, whereas distributional methods have higher recall. Given these complementary properties, we use the predictions of distributional methods as a back-off to pattern-based similarity. Since our pattern-based approach is embedded into a semi-supervised graph clustering algorithm, we also examine how distributional information is best added to that classifier. Our experiments are carried out on five different food categorization tasks.
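The back-off idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `backoff_similarity`, the dictionary-based similarity tables, and all scores are hypothetical, and the rule "fall back only when no pattern evidence exists" is an assumption about what back-off means here.

```python
from typing import Dict, FrozenSet

# Unordered noun pair used as a lookup key (hypothetical representation).
Pair = FrozenSet[str]


def backoff_similarity(
    a: str,
    b: str,
    pattern_sim: Dict[Pair, float],
    dist_sim: Dict[Pair, float],
) -> float:
    """Return the pattern-based score if one exists, else the distributional score."""
    key = frozenset((a, b))
    score = pattern_sim.get(key, 0.0)
    if score > 0.0:
        # Pattern evidence is sparse but precise, so it takes priority.
        return score
    # No pattern evidence: back off to the higher-recall distributional score.
    return dist_sim.get(key, 0.0)


# Made-up scores for illustration only.
pattern_sim = {frozenset(("apple", "pear")): 0.8}
dist_sim = {
    frozenset(("apple", "pear")): 0.5,
    frozenset(("apple", "plum")): 0.4,
}

print(backoff_similarity("apple", "pear", pattern_sim, dist_sim))
print(backoff_similarity("apple", "plum", pattern_sim, dist_sim))
```

In a graph-based setting, scores of this kind could serve as edge weights between noun nodes. How the paper actually integrates them into its classifier is not specified in the abstract.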
Similar resources
An Optimal Approach to Local and Global Text Coherence Evaluation Combining Entity-based, Graph-based and Entropy-based Approaches
Text coherence evaluation has become a vital task in Natural Language Processing subfields such as text summarization, question answering, text generation and machine translation. Existing methods, such as entity-based and graph-based models, track how nouns and noun phrases change roles across consecutive sentences within a short part of a text. They also have limitations in global coheren...
Learning compound noun semantics
This thesis investigates computational approaches for analysing the semantic relations in compound nouns and other noun-noun constructions. Compound nouns in particular have received a great deal of attention in recent years due to the challenges they pose for natural language processing systems. One reason for this is that the semantic relation between the constituents of a compound is not exp...
Noun Sense Induction and Disambiguation using Graph-Based Distributional Semantics
We introduce an approach to word sense induction and disambiguation. The method is unsupervised and knowledge-free: sense representations are learned from distributional evidence and subsequently used to disambiguate word instances in context. These sense representations are obtained by clustering dependency-based second-order similarity networks. We then add features for disambiguation from het...
Using Distributional Similarity of Multi-way Translations to Predict Multiword Expression Compositionality
We predict the compositionality of multiword expressions using distributional similarity between each component word and the overall expression, based on translations into multiple languages. We evaluate the method over English noun compounds, English verb particle constructions and German noun compounds. We show that the estimation of compositionality is improved when using translations into m...
A Combined Pattern-based and Distributional Approach for Automatic Hypernym Detection in Dutch
This paper proposes a two-step approach to find hypernym relations between pairs of noun phrases in Dutch text. We first apply a pattern-based approach that combines lexical and shallow syntactic information to extract a list of candidate hypernym pairs from the input text. In a second step, distributional similarity information is used to filter the obtained list of candidate pairs. Evaluation...